SUPERFAMILY: HMMs representing all proteins of known structure. SCOP sequence searches, alignments and genome assignments
Abstract
The SUPERFAMILY database contains a library of hidden Markov models (HMMs) representing all proteins of known structure. The database is based on the SCOP 'superfamily' level of protein domain classification, which groups together the most distantly related proteins that have a common evolutionary ancestor. There is a public server at http://supfam.org that provides three services: sequence searching, multiple alignments to sequences of known structure, and structural assignments to all complete genomes. Given an amino acid or nucleotide query sequence, the server returns the domain architecture and SCOP classification. The server produces alignments of the query sequences with sequences of known structure, and includes multiple alignments of genome and PDB sequences. The structural assignments are carried out on all complete genomes (currently 59), covering approximately half of the soluble protein domains. The assignments, the superfamily breakdown and statistics on them are available from the server. The database is currently used by this group and others for genome annotation, structural genomics, gene prediction and domain-based genomic studies.
Similar resources
The SUPERFAMILY database in structural genomics.
The SUPERFAMILY hidden Markov model library representing all proteins of known structure predicts the domain architecture of protein sequences and classifies them at the SCOP superfamily level. This analysis has been carried out on all completely sequenced genomes. The ways in which the database can be useful to crystallographers are discussed, in particular with a view to high-throughput struct...
Optimal Hidden Markov Models for All Sequences of Known Structure
Hidden Markov Models (HMMs) are probably the most powerful tool for the detection of protein sequence homology [4]. Maximization of their capabilities and biological usefulness requires the correct interpretation of their scores, and sufficient coverage of the sequence variations that exist in different protein families. Using information available from the SCOP database we investigated optimal...
SUPERFAMILY—sophisticated comparative genomics, data mining, visualization and phylogeny
SUPERFAMILY provides structural, functional and evolutionary information for proteins from all completely sequenced genomes, and large sequence collections such as UniProt. Protein domain assignments for over 900 genomes are included in the database, which can be accessed at http://supfam.org/. Hidden Markov models based on Structural Classification of Proteins (SCOP) domain definitions at the ...
Large-scale comparison of protein sequence alignment algorithms with structure alignments.
Sequence alignment programs such as BLAST and PSI-BLAST are used routinely in pairwise, profile-based, or intermediate-sequence-search (ISS) methods to detect remote homologies for the purposes of fold assignment and comparative modeling. Yet, the sequence alignment quality of these methods at low sequence identity is not known. We have used the CE structure alignment program (Shindyalov and Bo...
A Network of Hidden Markov Models and Its Analysis
The Structural Classification of Proteins (SCOP) database uses a large number of hidden Markov models (HMMs) to represent families and superfamilies composed of proteins that presumably share the same evolutionary origin. However, how the HMMs are related to one another has not been examined before. In this work, taking into account the processes used to build the HMMs, we propose a working hyp...
Journal title:
- Nucleic acids research
Volume 30, Issue 1
Pages: -
Publication date: 2002